Can We Identify Individuals at Risk to Develop Multiple Myeloma? A Machine Learning Based Predictive Model

Mittelman, Moshe; Oster, Howard S.; Ben Shlomo, Yatir; Israel, Ariel; Jarchowcky Dolberg, Osnat; Hayek, Samah; Leshchinsky, Michael; Kepten, Eldad; Balicer, Ran; Shaham, Galit

doi:10.1182/blood-2022-162438

Moshe Mittelman

1Department of Hematology, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel

2Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel

Howard S. Oster

3Sackler Faculty of Medicine, Tel-Aviv University, Tel-Aviv, Israel

4Department of Internal Medicine A, Tel-Aviv Sourasky Medical Center, Tel-Aviv, Israel

Yatir Ben Shlomo *

5Clalit Research Institute, Innovation Division, Clalit Health Services, Tel Aviv, Israel

Ariel Israel *

6Leumit Health Services, Tel Aviv, Israel

Osnat Jarchowcky Dolberg *

7Hematology Department, Meir Medical Center, Kfar Saba, Israel

Samah Hayek *

5Clalit Research Institute, Innovation Division, Clalit Health Services, Tel Aviv, Israel

Michael Leshchinsky *

5Clalit Research Institute, Innovation Division, Clalit Health Services, Tel Aviv, Israel

Eldad Kepten *

5Clalit Research Institute, Innovation Division, Clalit Health Services, Tel Aviv, Israel

Ran Balicer *

5Clalit Research Institute, Innovation Division, Clalit Health Services, Tel Aviv, Israel

Galit Shaham *

5Clalit Research Institute, Innovation Division, Clalit Health Services, Tel Aviv, Israel

Abstract

Background: Multiple myeloma (MM) evolves over years. Pre-MM states, such as monoclonal gammopathy of undetermined significance (MGUS) and smoldering MM, are asymptomatic and often missed. When active MM is diagnosed, it is often already associated with organ damage. We hypothesized that applying a machine learning (ML) approach to electronic medical records (EMR) could yield a predictive model that identifies the individuals at highest risk for MM.

Methods: An observational retrospective study was performed using data extracted from the Clalit Health Services (CHS) EMR. CHS insures 4.7 × 10⁶ individuals (53% of the Israeli population). The study included CHS members diagnosed with MM between 2002 and 2019 and their controls. First, we compared numerous clinical and laboratory parameters of MM patients in the pre-MM period (5 years to 2 months prior to diagnosis) with those of controls. Then, an ML approach was used to develop a risk prediction model using a gradient boosting algorithm. The unit of analysis was a patient who underwent a blood test in a given month. The training set included units from "future MM" patients and from matched controls. Model performance was evaluated on a separate test set comprising blood tests performed in 2014 by patients who were not included in the training set and had not been diagnosed with MM at the time of their blood tests. Lastly, a simplified model was constructed by excluding MM-specific variables and applying logistic regression.
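The pre-MM window and the patient-month unit of analysis described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the whole-calendar-month arithmetic, and the inclusive window bounds are all assumptions, since the abstract only states "5 years to 2 months prior to diagnosis" and "a patient who underwent a blood test at a given month".

```python
from datetime import date

# Assumed window bounds in whole calendar months before diagnosis.
# The abstract gives "5 years to 2 months"; inclusive bounds are an assumption.
WINDOW_MAX_MONTHS = 60
WINDOW_MIN_MONTHS = 2


def months_before(diagnosis: date, test: date) -> int:
    """Whole calendar months from the test month to the diagnosis month."""
    return (diagnosis.year - test.year) * 12 + (diagnosis.month - test.month)


def in_pre_mm_window(diagnosis: date, test: date) -> bool:
    """True if the test month falls inside the assumed pre-MM window."""
    gap = months_before(diagnosis, test)
    return WINDOW_MIN_MONTHS <= gap <= WINDOW_MAX_MONTHS


def patient_month_units(patient_id, diagnosis, tests):
    """Collapse in-window tests into unique (patient, year, month) units."""
    units = {
        (patient_id, t.year, t.month)
        for t in tests
        if in_pre_mm_window(diagnosis, t)
    }
    return sorted(units)


# Hypothetical patient, invented for illustration only (not study data).
dx = date(2015, 6, 15)
tests = [date(2015, 4, 20), date(2015, 4, 2), date(2010, 6, 1), date(2015, 5, 1)]
print(patient_month_units("p1", dx, tests))
```

Two tests in the same calendar month collapse into one unit here, which is one plausible reading of the unit of analysis; the abstract does not say how repeated tests within a month were handled.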

Results: We identified 4982 MM patients, of whom 4256 had the relevant lab tests and were therefore eligible for comparison. In the pre-MM period, "future MM" patients had higher ESR; lower Hb, neutrophil count (ANC) and neutrophil/lymphocyte ratio; and higher levels of serum globulins, urinary protein, serum IgG and ferritin than controls. They were more likely than controls to have immune deficiencies, myelodysplastic syndromes and familial Mediterranean fever. Use of certain medications (tranquilizers, anti-diabetics, Ca-antagonists, statins) was associated with reduced risk for MM. The gradient boosting predictive model was developed using 19,129 learning units of MM cases and 382,580 controls. The test set included 268,058 blood tests, of which 368 (0.14%) belonged to patients who were diagnosed with MM within 5 years. The model performed well, with an area under the curve (AUC) of 0.836. Ranges of MM predictors stratify the risk of developing the disease in the future (Figure 1). The simplified logistic regression model included 9 parameters: age, sex, uric acid, LDH, RBC, lymphocyte %, ANC, HDL and non-HDL cholesterol. It had an AUC of 0.794. An example of its use is provided in Table 1.
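To make the AUC figures concrete: the AUC equals the probability that a randomly chosen future-MM unit receives a higher model score than a randomly chosen control unit. The sketch below computes it by direct pairwise comparison on invented toy scores (not study data), then checks two pieces of arithmetic on the counts reported above. The function and variable names are illustrative assumptions, and the abstract does not state how the authors computed their AUC.

```python
def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Toy scores, invented for illustration only (not study data).
toy_cases = [0.9, 0.6, 0.4]
toy_controls = [0.5, 0.3, 0.2, 0.1]
print(auc(toy_cases, toy_controls))

# Arithmetic on figures reported in the abstract.
test_prevalence = 368 / 268_058        # share of test-set blood tests from future-MM patients
controls_per_case = 382_580 / 19_129   # reported controls per MM learning unit
print(f"{test_prevalence:.2%}", controls_per_case)
```

The pairwise loop is quadratic and only suitable for small examples; on a test set of the size reported here a rank-based computation would be the practical choice. The ratio of controls to case units is derived from the reported counts and is not stated explicitly in the abstract.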

Conclusions: Using a large database and an ML approach, we were able to develop a predictive model of MM risk. Using only a few widely available parameters, a predicted MM risk can be provided for any individual undergoing simple blood tests. The model can be used as a first-line screening tool, pointing clinicians to the individuals most at risk for MM and allowing them to focus further workup accordingly.
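For orientation, a logistic regression model of this kind converts an individual's parameter values into a predicted risk through the standard logistic form below. The notation is generic and assumed for illustration: the fitted coefficients are not reported in the abstract, and treating the nine parameters as exactly nine terms $x_1, \dots, x_9$ (for example, with sex coded as a single indicator) is an assumption.

```latex
\[
\hat{p}(\mathbf{x}) \;=\; \frac{1}{1 + \exp\!\left(-\left(\beta_0 + \sum_{i=1}^{9} \beta_i x_i\right)\right)}
\]
```

Here $\hat{p}(\mathbf{x})$ is the predicted probability of a future MM diagnosis, $\beta_0$ is the intercept, and each $\beta_i$ weights one of the nine parameters listed in the Results.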

Figure 1


Disclosures

Mittelman: Novartis: Research Funding; Takeda: Honoraria, Research Funding; Janssen: Research Funding; Roche: Research Funding; BMS: Research Funding; Celgene: Research Funding; Medison: Research Funding; Gilead: Research Funding.

Author notes

Asterisk with author names denotes non-ASH members.

2022
